Teaching “Unstructured Information Management: Theory and Applications” to Computational Linguistics Students
Abstract
Students in Computational Linguistics often lack experience in building robust and scalable software components. As a result, student projects tend to be unstable and to work only under very specific preconditions (e.g., a project has to be installed in a certain directory, or handles only single files instead of whole directories). Furthermore, students who have to build a system from scratch must spend their effort on input and output issues and on connecting numerous preprocessing components that were not designed to work together. This limits the scope of feasible course tasks to relatively simple ones, such as implementing yet another tokenizer. When offering the course “Unstructured Information Management: Theory and Applications” as part of the B.A./M.A. program of International Studies in Computational Linguistics at the University of Tübingen, our motivation was to familiarize students with fundamental concepts in unstructured information management and Natural Language Processing (NLP) middleware. This should enable students of computational linguistics to work on more challenging tasks and to gain first experience with building complex software systems. The course goals were supported by providing basic preprocessing components, such as a tokenizer or a PoS tagger, on the basis of the Unstructured Information Management Architecture (UIMA) (Ferrucci and Lally, 2004). Thus, students of computational linguistics can concentrate on their core competence and work on more challenging tasks both in terms of theoretical complex-